Abstract

A daunting challenge faced by modern biological sciences is finding an efficient and computationally feasible approach to deal with the curse of high dimensionality. The problem becomes even more severe when the research focus is on interactions. To improve the performance, we propose a low-rank interaction model, where the interaction effects are modeled using a low-rank matrix. With parsimonious parameterization of interactions, the proposed model increases the stability and efficiency of statistical analysis. Built upon the low-rank model, we further propose an Extended Screen-and-Clean approach, based on the Screen and Clean (SC) method (Wasserman and Roeder, 2009; Wu et al., 2010), to detect gene-gene interactions. In particular, the screening stage utilizes a combination of a low-rank structure and a sparsity constraint in order to achieve higher power and higher selection-consistency probability. We demonstrate the effectiveness of the method using simulations and apply the proposed procedure to the warfarin dosage study. The data analysis identified main and interaction effects that would have been neglected using conventional methods.
1 Introduction

Modern biological research deals with high-throughput data and encounters the curse of high dimensionality. The problem is further exacerbated when the question of interest focuses on gene-gene interactions (GxG). Due to the extremely high dimensionality of GxG models, many GxG methods are multi-staged in nature and rely on a screening step to reduce the number of loci (Cordell 2009; Wu et al. 2010). Joint screening based on the multi-locus model with all main effect and interaction terms is preferred over marginal screening based on single-locus tests: it improves the ability to identify loci that interact with each other but exhibit little marginal effect (Wan et al. 2010), and it improves the overall screening performance by reducing the unexplained variance in the model (Wu et al. 2010). However, joint screening poses statistical and computational challenges due to the ultra-large number of variables. To tackle this problem, one promising method is the Screen and Clean (SC) procedure (Wasserman and Roeder, 2009; Wu et al. 2010). The SC procedure first uses Lasso to pre-screen candidate loci, where only main effects are considered. Next, the expanded covariates are constructed to include the selected loci and their corresponding pairwise interactions, and another Lasso is applied to identify important terms. Finally, in the cleaning stage with an independent data set, the effects of the selected terms are estimated by the least squares estimate (LSE) method, and those terms that pass the t-test cleaning are identified to form the final model.
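The construction of the expanded covariates in the second SC step can be sketched as follows. This is a minimal illustration rather than the SC implementation: the function name `expand_design`, the use of NumPy, and the toy genotype matrix are all assumptions made here.

```python
from itertools import combinations

import numpy as np


def expand_design(G, selected):
    """Columns for the selected loci followed by all their pairwise products.

    G is an n x p genotype matrix; `selected` holds 0-based column indices
    of the loci kept by the first (main-effect) Lasso. Hypothetical helper.
    """
    selected = list(selected)
    pairs = list(combinations(selected, 2))
    main = G[:, selected]
    if pairs:
        inter = np.column_stack([G[:, j] * G[:, k] for j, k in pairs])
    else:
        inter = np.empty((G.shape[0], 0))
    labels = [(j,) for j in selected] + pairs  # which term each column encodes
    return np.hstack([main, inter]), labels


# Toy data: 8 individuals, 6 loci coded 0/1/2 (values are illustrative only).
rng = np.random.default_rng(0)
G = rng.integers(0, 3, size=(8, 6)).astype(float)
X_expanded, labels = expand_design(G, [1, 3, 4])
```

With s selected loci the expanded matrix has s + s(s-1)/2 columns, which is the quantity the second Lasso has to cope with.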

A crucial component of the SC procedure is the Lasso step in the screening process for interactions. Let $Y$ be the response of interest and $G = (g_1, \ldots, g_p)^T$ be the genotypes at the $p$ loci. A typical model for GxG detection, which is also the model considered in SC, is

$$E(Y \mid G) = \gamma + \sum_{j=1}^{p} \xi_j g_j + \sum_{j<k} \eta_{jk}\,(g_j g_k), \tag{1}$$

where $\xi_j$ is the main effect of the $j$th locus, and $\eta_{jk}$, $j < k$, is the GxG effect corresponding to the $j$th and $k$th loci. The Lasso step of SC then fits model (1) to reduce the model size from

$$m_p = 1 + p + \binom{p}{2} \tag{2}$$

to a number smaller than the sample size $n$, so that the validity of the subsequent LSE cleaning can be guaranteed. The performance of Lasso is known to depend on the number of parameters $m_p$ and the available sample size $n$. Although Lasso has been verified to perform well for large $m_p$, caution should be used when $m_p$ is ultra-large, such as of the order $\exp\{O(n^{\delta})\}$ for some $\delta > 0$ (Fan and Lv, 2008). In addition, the $m_p$ encountered in modern biomedical studies is usually much larger than $n$ even for a moderate size of $p$. In this situation, statistical inference can become unstable and inefficient, which would impact the screening performance and consequently affect the selection-consistency of the SC procedure or reduce the power of the t-test cleaning.

To improve the exhaustive screening involving all main and interaction terms, we consider a reduced model by utilizing the matrix nature of the interaction terms. Observing in model (1) that $g_j g_k$ is the $(j,k)$th element of the symmetric matrix $J = GG^T$, it is natural to treat $\eta_{jk}$ as the $(j,k)$th entry of a symmetric matrix $\eta$, which leads to an equivalent expression of model (1) as

$$E(Y \mid G) = \gamma + \xi^T G + \mathrm{vecp}(\eta)^T \mathrm{vecp}(J), \tag{3}$$

where $\xi = (\xi_1, \ldots, \xi_p)^T$ and $\mathrm{vecp}(\cdot)$ denotes the operator that stacks the lower half (excluding the diagonal) of a symmetric matrix columnwise into a long vector. With the model expression (3), we can utilize the structure of the symmetric matrix $\eta$ to improve the inference procedure. Specifically, we posit the following condition for the interaction parameters:

$$\eta \ \text{is sparse and low-rank.} \tag{4}$$

Condition (4) is typically satisfied in modern biomedical research. First, in a GxG scan, it is reasonable to assume that most elements of $\eta$ are zero because only a small portion of the terms are related to the response $Y$. This sparsity assumption is also the underlying rationale for applying Lasso for variable selection in conventional approaches (e.g., Wu's SC procedure). Second, if the elements of $\eta$ are sparse, the matrix $\eta$ is also likely to be low-rank. Displayed below is an example of $\eta$ with $p = 10$ that contains three pairs of non-zero interactions, and hence has rank 3 only:

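$$\eta = \begin{bmatrix} \begin{matrix} 0 & * & * \\ * & 0 & * \\ * & * & 0 \end{matrix} & 0_{3\times 7} \\ 0_{3\times 7}^T & 0_{7\times 7} \end{bmatrix}. \tag{5}$$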
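The rank claim for this example is easy to check numerically. The sketch below assumes, for illustration, that the three non-zero pairs are (1,2), (1,3) and (2,3) and uses arbitrary non-zero values; neither choice comes from the paper.

```python
import numpy as np

p = 10
eta = np.zeros((p, p))
# Three interacting pairs among the first three loci (0-based indices);
# the numerical values are arbitrary placeholders.
for (j, k), value in {(0, 1): 1.0, (0, 2): 0.5, (1, 2): -0.7}.items():
    eta[j, k] = eta[k, j] = value

rank_eta = np.linalg.matrix_rank(eta)   # rank of the 10 x 10 matrix
n_free_entries = p * (p - 1) // 2       # entries an unstructured fit must estimate
```

Here the 3 x 3 block has determinant 2abc for off-diagonal values a, b, c, so it is full rank whenever all three are non-zero, while an unstructured fit would still carry p(p-1)/2 = 45 interaction parameters.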
One key characteristic of our proposed method is the use of the sparse and low-rank condition (4), which allows us to express $\eta$ with far fewer parameters. In contrast, Lasso does not utilize the matrix structure but only assumes the sparsity of $\eta$ and, hence, still involves $\binom{p}{2}$ parameters in $\eta$. From a statistical viewpoint, parsimonious parameterizations can improve the efficiency of model inference. The aims of this work are thus twofold. First, using model (3) and condition (4), we propose an efficient screening procedure referred to as the sparse and low-rank screening (SLR-screening). Second, we demonstrate how SLR-screening can be incorporated into existing multi-stage GxG methods to enhance the power and selection-consistency. Based on the promise of the SC procedure, we illustrate the concept by proposing the Extended Screen-and-Clean (ESC) procedure, which replaces the Lasso screening with SLR-screening in the standard SC procedure.
Some notation is defined here for reference. Let $\{(Y_i, G_i)\}_{i=1}^{n}$ be random copies of $(Y, G)$, and let $J_i = G_i G_i^T$. Let $Y = (Y_1, \ldots, Y_n)^T$ be the $n$-vector of observed responses, and let $X = [X_1, \ldots, X_n]^T$ be the design matrix with $X_i = [1, G_i^T, \mathrm{vecp}(J_i)^T]^T$. For any square matrix $M$, $M^{-}$ is its Moore-Penrose generalized inverse. $\mathrm{vec}(\cdot)$ is the operator that stacks a matrix columnwise into a long vector. $K_{p,k}$ is the commutation matrix such that $K_{p,k}\,\mathrm{vec}(M) = \mathrm{vec}(M^T)$ for any $p \times k$ matrix $M$ (Henderson and Searle, 1979; Magnus and Neudecker, 1979). $P$ is the matrix satisfying $P\,\mathrm{vec}(M) = \mathrm{vecp}(M)$ for any $p \times p$ symmetric matrix $M$. $P$ can be chosen such that $PK_{p,p} = P$. For a vector, $\|\cdot\|$ is its Euclidean norm (2-norm), and $\|\cdot\|_1$ is its 1-norm. For a set, $|\cdot|$ denotes its cardinality.
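The operators in this notation can be made concrete with a short sketch. The construction of $P$ below, as the average of a lower-triangle selection matrix $E$ and $EK_{p,p}$, is one possible choice satisfying $PK_{p,p} = P$; it is an assumption made for illustration, not necessarily the authors' choice, and all function names are hypothetical.

```python
import numpy as np


def vec(M):
    """Stack the columns of M into one long vector."""
    return M.reshape(-1, order="F")


def vecp(M):
    """Stack the strictly lower triangle of a square matrix columnwise."""
    p = M.shape[0]
    return np.concatenate([M[k + 1:, k] for k in range(p)])


def commutation_matrix(p, k):
    """K with K @ vec(M) == vec(M.T) for every p x k matrix M."""
    K = np.zeros((p * k, p * k))
    for i in range(p):
        for j in range(k):
            # M[i, j] is entry i + j*p of vec(M) and entry j + i*k of vec(M.T).
            K[j + i * k, i + j * p] = 1.0
    return K


def vecp_matrix(p):
    """One choice of P with P @ vec(M) == vecp(M) (M symmetric) and P @ K == P."""
    pairs = [(j, k) for k in range(p) for j in range(k + 1, p)]
    E = np.zeros((len(pairs), p * p))
    for row, (j, k) in enumerate(pairs):
        E[row, j + k * p] = 1.0          # picks M[j, k] out of vec(M)
    return 0.5 * (E + E @ commutation_matrix(p, p))
```

Because $K_{p,p}^2 = I$, averaging $E$ with $EK_{p,p}$ gives $PK_{p,p} = P$ automatically, and for symmetric $M$ both halves select the same entries.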

2 Inference Procedure for Low-Rank Model 

2.1 Model specification and estimation 

To incorporate the low-rank property (4) into model building, for a pre-specified positive integer $r \le p$, we consider the following rank-$r$ model:

$$E(Y \mid G) = \gamma + \xi^T G + \mathrm{vecp}(\eta)^T \mathrm{vecp}(J), \quad \mathrm{rank}(\eta) \le r. \tag{6}$$

Although the above low-rank model expression is straightforward, it is not convenient for numerical implementation. We therefore adopt an equivalent parameterization $\eta(\phi)$ for $\eta$ that directly satisfies the constraint $\mathrm{rank}(\eta) \le r$. For the case of minimum rank $r = 1$ (the rank-1 model), we use the parameterization

$$\eta(\phi) = u\,aa^T, \quad \phi = (a^T, u)^T, \quad a \in \mathbb{R}^{p},\ u \in \mathbb{R}. \tag{7}$$

For the case of higher rank, we consider the parameterization

$$\eta(\phi) = AB^T + BA^T, \quad \phi = \mathrm{vec}(A, B), \quad A, B \in \mathbb{R}^{p \times k}, \tag{8}$$

which gives $r = 2k$ (the rank-$2k$ model), since the maximum rank attainable by $\eta(\phi)$ in (8) is $2k$. Note that in either case (7) or (8), the number of parameters required for the interactions $\eta(\phi)$ can be much smaller than $\binom{p}{2}$. See Remark 1 for further explanation. Thus, when model (6) is true, standard MLE arguments show that statistical inference based on model (6) is the most efficient. Even if model (6) is incorrectly specified, we still favor the low-rank model when the sample size is small. In this situation, model (6) provides a good "working" model: it compromises between the model approximation bias and the efficiency of parameter estimation. With a limited sample size, instead of unstably estimating the full model, it is preferable to estimate the approximate low-rank model more efficiently. As will be shown later, a low-rank approximation of $\eta$ with parsimonious parameterization suffices to screen out relevant interactions more efficiently.
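A quick numerical check of the two parameterizations (7) and (8) may help fix ideas. The dimensions, random factors and variable names below are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(1)
p, k = 12, 2

# Rank-1 parameterization (7): eta = u * a a^T.
a = rng.normal(size=p)
u = -1.0
eta_rank1 = u * np.outer(a, a)

# Rank-2k parameterization (8): eta = A B^T + B A^T.
A = rng.normal(size=(p, k))
B = rng.normal(size=(p, k))
eta_rank2k = A @ B.T + B @ A.T

rank1 = np.linalg.matrix_rank(eta_rank1)
rank2k = np.linalg.matrix_rank(eta_rank2k)   # cannot exceed 2k
n_params_factors = 2 * p * k                 # entries of A and B
n_params_unstructured = p * (p - 1) // 2     # free entries of a general eta
```

Both constructions are symmetric by design, so the symmetry of $\eta$ never has to be imposed as a separate constraint.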



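Let the parameters of interest in the rank-$r$ model (6) be

$$\beta(\theta) = [\gamma, \xi^T, \mathrm{vecp}\{\eta(\phi)\}^T]^T \quad \text{with} \quad \theta = (\gamma, \xi^T, \phi^T)^T, \tag{9}$$

which consist of the intercept, main effects, and interactions. Under model (6) and assuming i.i.d. errors from a normal distribution $N(0, \sigma^2)$, the log-likelihood function (apart from a constant term) is

$$\ell(\theta) = -\frac{1}{2}\sum_{i=1}^{n}\left\{Y_i - \gamma - \xi^T G_i - \mathrm{vecp}\{\eta(\phi)\}^T \mathrm{vecp}(J_i)\right\}^2 = -\frac{1}{2}\|Y - X\beta(\theta)\|^2. \tag{10}$$

To further stabilize the maximum likelihood estimation (MLE), a common approach is to append a penalty on $\theta$ to the log-likelihood function. We thus propose to estimate $\theta$ by maximizing the penalized log-likelihood function

$$\ell_{\lambda_\ell}(\theta) = \ell(\theta) - \frac{\lambda_\ell}{2}\|\theta\|^2, \tag{11}$$

where $\lambda_\ell$ is the penalty (the subscript $\ell$ is for low-rank). Denote the penalized MLE as

$$\hat\theta_{\lambda_\ell} = (\hat\gamma_{\lambda_\ell}, \hat\xi_{\lambda_\ell}^T, \hat\phi_{\lambda_\ell}^T)^T = \operatorname*{argmax}_{\theta}\ \ell_{\lambda_\ell}(\theta). \tag{12}$$

The parameters of interest $\beta(\theta)$ are then estimated by

$$\hat\beta_{\lambda_\ell} = \beta(\hat\theta_{\lambda_\ell}), \tag{13}$$

on which subsequent analysis for main and GxG effects can be based. In practical implementation, we use K-fold cross-validation ($K = 10$ in this work) to select $\lambda_\ell$.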

Remark 1. We only need $pr - r^2/2 + r/2$ parameters to specify a $p \times p$ rank-$r$ symmetric matrix, and the number of parameters required for model (6) is

$$d_r = 1 + p + (pr - r^2/2 + r/2). \tag{14}$$

The parameterizations (7) and (8) use more free parameters than this count, so $\phi$ is not identifiable without additional constraints. However, adding constraints makes no difference to our inference procedures, but only increases the difficulty in computation. For convenience, we keep this simple usage of $\phi$ without imposing any identifiability constraint.
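To give a feel for the saving in Remark 1, the following sketch compares the full model size $m_p$ in (2) with $d_r$ in (14). The function names and the choice $p = 50$ are illustrative assumptions.

```python
from math import comb


def full_model_size(p):
    """m_p = 1 + p + C(p, 2): intercept, main effects, all pairwise interactions."""
    return 1 + p + comb(p, 2)


def low_rank_model_size(p, r):
    """d_r = 1 + p + (p*r - r^2/2 + r/2), written with integer arithmetic."""
    return 1 + p + (2 * p * r - r * r + r) // 2


p = 50
m_p = full_model_size(p)
d_1 = low_rank_model_size(p, 1)
d_2 = low_rank_model_size(p, 2)
```

For $p = 50$ this gives $m_p = 1276$ against $d_1 = 101$ and $d_2 = 150$, so the low-rank models remain estimable at sample sizes where the full model is not.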

2.2 Implementation algorithm 

2.2.1 The case of rank-1 model 

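For the rank-1 model $\eta(\phi) = u\,aa^T$, it suffices to maximize (11) using the Newton method under both $u = +1$ and $u = -1$. The fit from $u = \pm 1$ with the larger value of the penalized log-likelihood is used as the estimate of $\theta$. For any fixed $u$, maximizing (11) is equivalent to the minimization problem

$$\min_{\theta_u}\ \frac{1}{2}\|Y - X_u\beta_u(\theta_u)\|^2 + \frac{\lambda_\ell}{2}\|\theta_u\|^2, \tag{15}$$

where $X_u = [X_{u1}, \ldots, X_{un}]^T$ with $X_{ui} = [1, G_i^T, u \cdot \mathrm{vecp}(J_i)^T]^T$ is the design matrix, and $\beta_u(\theta_u) = \{\gamma, \xi^T, \mathrm{vecp}(aa^T)^T\}^T$ with $\theta_u = (\gamma, \xi^T, a^T)^T$. Define

$$W_u(\theta_u) = X_u\,\frac{\partial \beta_u(\theta_u)}{\partial \theta_u} \quad \text{with} \quad \frac{\partial \beta_u(\theta_u)}{\partial \theta_u} = \begin{bmatrix} I_{p+1} & 0 \\ 0 & 2P(a \otimes I_p) \end{bmatrix}.$$

The gradient and Hessian matrix (ignoring the zero-expectation term) of (15) are

$$g_u(\theta_u) = -\{W_u(\theta_u)\}^T\{Y - X_u\beta_u(\theta_u)\} + \lambda_\ell\theta_u,$$

$$H_u(\theta_u) = \{W_u(\theta_u)\}^T\{W_u(\theta_u)\} + \lambda_\ell I_{2p+1}.$$

Then, given an initial $\theta_u^{(0)}$, the minimizer $\hat\theta_u$ of (15) can be obtained through the iteration

$$\theta_u^{(t+1)} = \theta_u^{(t)} - \{H_u(\theta_u^{(t)})\}^{-1} g_u(\theta_u^{(t)}), \quad t = 0, 1, 2, \ldots, \tag{16}$$

until convergence, with output $\hat\theta_u = \theta_u^{(t+1)}$. Let $u^*$ correspond to the optimal $u$ from $u = \pm 1$. The final estimate is defined to be $\hat\theta_{\lambda_\ell} = (\hat\theta_{u^*}^T, u^*)^T$.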
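The rank-1 iteration can be sketched in NumPy as follows. This is an illustrative sketch, not the authors' code: the function names, the random starting value, and the step-halving safeguard (which is not part of (16) and is added here only so that the objective never increases) are assumptions. The Jacobian of $\mathrm{vecp}(aa^T)$ is written out entrywise, which is the same matrix as $2P(a \otimes I_p)$.

```python
import numpy as np


def lower_pairs(p):
    """Index pairs (j, k), j > k, in the columnwise order used by vecp."""
    rows = [j for k in range(p) for j in range(k + 1, p)]
    cols = [k for k in range(p) for j in range(k + 1, p)]
    return np.array(rows), np.array(cols)


def pieces(theta, Xu, Y, lam, rows, cols):
    """Objective (15), gradient g_u and approximate Hessian H_u at theta."""
    p = (theta.size - 1) // 2
    a = theta[p + 1:]
    beta = np.concatenate([theta[:p + 1], a[rows] * a[cols]])
    resid = Y - Xu @ beta
    # Jacobian of vecp(a a^T) with respect to a: d(a_j a_k) = a_k da_j + a_j da_k.
    D = np.zeros((rows.size, p))
    D[np.arange(rows.size), rows] = a[cols]
    D[np.arange(rows.size), cols] = a[rows]
    W = np.hstack([Xu[:, :p + 1], Xu[:, p + 1:] @ D])
    obj = 0.5 * resid @ resid + 0.5 * lam * theta @ theta
    grad = -W.T @ resid + lam * theta
    hess = W.T @ W + lam * np.eye(theta.size)
    return obj, grad, hess


def fit_rank1(Y, G, u, lam, n_iter=100, tol=1e-10, seed=0):
    """Iterate (16) for a fixed sign u, with step halving as a safeguard."""
    n, p = G.shape
    rows, cols = lower_pairs(p)
    Xu = np.hstack([np.ones((n, 1)), G, u * G[:, rows] * G[:, cols]])
    rng = np.random.default_rng(seed)
    theta = np.concatenate([np.zeros(p + 1), 0.1 * rng.normal(size=p)])
    obj, grad, hess = pieces(theta, Xu, Y, lam, rows, cols)
    for _ in range(n_iter):
        step = np.linalg.solve(hess, grad)
        scale = 1.0
        while scale > 1e-8:                 # halve until the objective does not rise
            cand = theta - scale * step
            new = pieces(cand, Xu, Y, lam, rows, cols)
            if new[0] <= obj:
                break
            scale *= 0.5
        else:
            break                           # no acceptable step was found
        converged = obj - new[0] < tol
        theta, (obj, grad, hess) = cand, new
        if converged:
            break
    return theta, obj


def fit_rank1_both_signs(Y, G, lam):
    """Fit under u = +1 and u = -1 and keep the sign with the smaller objective."""
    fits = {u: fit_rank1(Y, G, u, lam) for u in (1.0, -1.0)}
    u_star = min(fits, key=lambda u: fits[u][1])
    return fits[u_star][0], u_star
```

Minimizing (15) is the same as maximizing (11), so picking the smaller objective across the two signs corresponds to picking the larger penalized log-likelihood.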

2.2.2 The case of rank-2k model

When $\eta(\phi) = AB^T + BA^T$, we use the alternating least squares (ALS) method to maximize (11). With $B$ fixed, the problem of solving for $A$ becomes a standard penalized least squares problem. This can be seen from

$$\mathrm{vecp}(AB^T + BA^T) = 2P\,\mathrm{vec}(AB^T) = 2P(B \otimes I_p)\,\mathrm{vec}(A),$$

where the first equality holds by $PK_{p,p} = P$. Hence, maximizing (11) with fixed $B$ is equivalent to the minimization problem

$$\min_{\theta_B}\ \frac{1}{2}\|Y - X_B\theta_B\|^2 + \frac{\lambda_\ell}{2}\|\theta_B\|^2, \tag{17}$$

where $X_B = [X_{B1}, \ldots, X_{Bn}]^T$ with $X_{Bi} = [1, G_i^T, 2\,\mathrm{vecp}(J_i)^T P(B \otimes I_p)]^T$ being the design matrix when $B$ is fixed, and $\theta_B = [\gamma, \xi^T, \mathrm{vec}(A)^T]^T$. It can be seen that (17) is a penalized least squares problem with design matrix $X_B$ and parameters $\theta_B$, which is solved by

$$\hat\theta_B = (X_B^T X_B + \lambda_\ell I_{1+p+pk})^{-1} X_B^T Y. \tag{18}$$

Similarly, the maximization problem with fixed $A$ is equivalent to the minimization problem

$$\min_{\theta_A}\ \frac{1}{2}\|Y - X_A\theta_A\|^2 + \frac{\lambda_\ell}{2}\|\theta_A\|^2,$$

where $X_A = [X_{A1}, \ldots, X_{An}]^T$ with $X_{Ai} = [1, G_i^T, 2\,\mathrm{vecp}(J_i)^T P(A \otimes I_p)]^T$ being the design matrix when $A$ is fixed, and $\theta_A = [\gamma, \xi^T, \mathrm{vec}(B)^T]^T$. Thus, when $A$ is fixed, $\theta_A$ is solved by

$$\hat\theta_A = (X_A^T X_A + \lambda_\ell I_{1+p+pk})^{-1} X_A^T Y. \tag{19}$$

The ALS algorithm then iteratively alternates the roles of $A$ and $B$ until convergence. The detailed algorithm is summarized below.

Alternating Least Squares (ALS) Algorithm:

1. Set initial $B^{(0)}$. For $t = 0, 1, 2, \ldots$,

(1) Fix $B = B^{(t)}$, obtain $\hat\theta_{B^{(t)}} = \{\gamma^{(t)}, \xi^{(t)T}, \mathrm{vec}(A^{(t+1)})^T\}^T$ from (18).

(2) Fix $A = A^{(t+1)}$, obtain $\hat\theta_{A^{(t+1)}} = \{\gamma^{(t+1)}, \xi^{(t+1)T}, \mathrm{vec}(B^{(t+1)})^T\}^T$ from (19).

2. Repeat Step-1 until convergence. Output $(\gamma^{(t+1)}, \xi^{(t+1)}, A^{(t+1)}, B^{(t+1)})$ to form $\hat\theta_{\lambda_\ell}$.

Note that the objective function value does not decrease in any iteration of the ALS algorithm. In addition, the penalized log-likelihood function is bounded above by zero, which ensures that the ALS algorithm converges to a stationary point. We found in our numerical studies that a random initial $B^{(0)}$ converges quickly and produces a good solution.
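The ALS updates (18) and (19) can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, and the interaction block of the design row is computed as $\mathrm{vec}(J_{0i}B)$, where $J_{0i}$ is $J_i$ with its diagonal set to zero. That shortcut relies on the identity $\mathrm{vecp}(AB^T + BA^T)^T\mathrm{vecp}(J_i) = \mathrm{tr}(A^T J_{0i} B)$, which is derived here rather than taken from the paper.

```python
import numpy as np


def design(G, M):
    """Rows [1, G_i^T, vec(J0_i M)^T], with J0_i = G_i G_i^T off the diagonal."""
    rows = []
    for g in G:
        J0 = np.outer(g, g)
        np.fill_diagonal(J0, 0.0)
        rows.append(np.concatenate([[1.0], g, (J0 @ M).reshape(-1, order="F")]))
    return np.array(rows)


def ridge(X, Y, lam):
    """Closed-form penalized least squares, as in (18) and (19)."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ Y)


def penalized_loss(Y, G, gamma, xi, A, B, lam):
    """Negative of (11): half the residual sum of squares plus the ridge penalty."""
    theta_B = np.concatenate([[gamma], xi, A.reshape(-1, order="F")])
    resid = Y - design(G, B) @ theta_B
    penalty = gamma ** 2 + xi @ xi + np.sum(A ** 2) + np.sum(B ** 2)
    return 0.5 * resid @ resid + 0.5 * lam * penalty


def als(Y, G, k, lam, n_iter=50, tol=1e-8, seed=0):
    """Alternate the two ridge solves starting from a random B."""
    n, p = G.shape
    B = np.random.default_rng(seed).normal(size=(p, k))
    history = []
    for _ in range(n_iter):
        theta = ridge(design(G, B), Y, lam)          # update (gamma, xi, A), B fixed
        A = theta[p + 1:].reshape(p, k, order="F")
        theta = ridge(design(G, A), Y, lam)          # update (gamma, xi, B), A fixed
        gamma, xi = theta[0], theta[1:p + 1]
        B = theta[p + 1:].reshape(p, k, order="F")
        history.append(penalized_loss(Y, G, gamma, xi, A, B, lam))
        if len(history) > 1 and history[-2] - history[-1] < tol:
            break
    return gamma, xi, A, B, history
```

Each half-step minimizes the full penalized loss over one block with the other held fixed, so the recorded `history` is non-increasing, which mirrors the monotonicity property noted for the algorithm.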

2.3 Asymptotic properties 

This subsection is devoted to deriving the asymptotic distribution of $\hat\beta_{\lambda_\ell}$ defined in (13), which is the basis of the SLR-screening proposed in the next section. Assume that the parameter space $\Theta$ of $\theta$ is bounded, open and connected, and define $\Xi = \beta(\Theta)$ to be the induced parameter space. Let $\beta_0 = \{\gamma_0, \xi_0^T, \mathrm{vecp}(\eta_0)^T\}^T$ be the true parameter value of the low-rank model (6) and define

$$\Delta(\theta) = \frac{\partial \beta(\theta)}{\partial \theta}. \tag{20}$$

We need the following regularity conditions for deriving asymptotic properties.

(C1) Assume $\beta_0 = \beta(\theta_0)$ for some $\theta_0 \in \Theta$.

(C2) Assume that $\beta(\theta)$ is locally regular at $\theta_0$ in the sense that $\Delta(\theta)$ has the same rank as $\Delta(\theta_0)$ for all $\theta$ in a neighborhood of $\theta_0$. Further assume that there exist neighborhoods $U$ and $V$ of $\theta_0$ and $\beta_0$ such that $\Xi \cap V = \beta(U)$.

(C3) Let $V_n = \frac{1}{n}X^T X$. Assume that $V_n \to V_0$ in probability and that $V_0$ is strictly positive definite.

The main result is summarized in the following theorem.

Theorem 2. Assume model (6) and conditions (C1)-(C3). Assume also $\lambda_\ell = o(\sqrt{n})$. Then, as $n \to \infty$, we have

$$\sqrt{n}\,(\hat\beta_{\lambda_\ell} - \beta_0) \xrightarrow{d} N(0, \Sigma_0), \tag{21}$$

where $\Sigma_0 = \sigma^2 \Delta_0(\Delta_0^T V_0 \Delta_0)^{-}\Delta_0^T$ with $\Delta_0 = \Delta(\theta_0)$.



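To estimate the asymptotic covariance $\Sigma_0$, we need to estimate $(\sigma^2, \Delta_0)$. The error variance $\sigma^2$ can be naturally estimated by

$$\hat\sigma^2 = \frac{\|Y - X\hat\beta_{\lambda_\ell}\|^2}{n - d_r}, \tag{22}$$

where $d_r$ is defined in (14). We propose to estimate $\Delta_0$ by $\hat\Delta_0 = \Delta(\hat\theta_{\lambda_\ell})$. Finally, the asymptotic covariance matrix in Theorem 2 is estimated by

$$\hat\Sigma_0 = \hat\sigma^2\,\hat\Delta_0 U\Big(\Lambda + \frac{1}{n}I_{d_r}\Big)^{-1}U^T\hat\Delta_0^T, \tag{23}$$

where $U\Lambda U^T$ is the singular value decomposition of $\hat\Delta_0^T V_n \hat\Delta_0$, and $\Lambda \in \mathbb{R}^{d_r \times d_r}$ is the diagonal matrix consisting of the $d_r$ nonzero singular values, with the corresponding singular vectors in $U$. We note that adding $\frac{1}{n}I_{d_r}$ to $\Lambda$ in (23) aims to stabilize the estimator $\hat\Sigma_0$, and will not affect its consistency for $\Sigma_0$.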
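A sketch of this covariance estimator for the rank-1 parameterization is given below. It assumes that the estimator has the form $\hat\sigma^2\hat\Delta_0 U(\Lambda + n^{-1}I_{d_r})^{-1}U^T\hat\Delta_0^T$ built from the $d_r$ leading eigenpairs of $\hat\Delta_0^T V_n\hat\Delta_0$; the function names and the explicit Jacobian for $\theta = (\gamma, \xi, a, u)$ are illustrative assumptions.

```python
import numpy as np


def lower_pairs(p):
    """Index pairs (j, k), j > k, in the columnwise order used by vecp."""
    rows = [j for k in range(p) for j in range(k + 1, p)]
    cols = [k for k in range(p) for j in range(k + 1, p)]
    return np.array(rows), np.array(cols)


def jacobian_rank1(a, u):
    """Delta(theta) for beta = [gamma, xi, vecp(u a a^T)], theta = (gamma, xi, a, u)."""
    p = a.size
    rows, cols = lower_pairs(p)
    q = rows.size
    D = np.zeros((q, p))                       # derivative of vecp(a a^T) w.r.t. a
    D[np.arange(q), rows] = a[cols]
    D[np.arange(q), cols] = a[rows]
    top = np.hstack([np.eye(p + 1), np.zeros((p + 1, p + 1))])
    bottom = np.hstack([np.zeros((q, p + 1)), u * D, (a[rows] * a[cols])[:, None]])
    return np.vstack([top, bottom])


def sigma0_hat(Delta, Vn, sigma2, d_r, n):
    """sigma2 * Delta U (Lambda + I/n)^{-1} U^T Delta^T with the d_r leading eigenpairs."""
    M = Delta.T @ Vn @ Delta                   # symmetric positive semidefinite
    vals, vecs = np.linalg.eigh(M)
    keep = np.argsort(vals)[::-1][:d_r]        # indices of the d_r largest eigenvalues
    DU = Delta @ vecs[:, keep]
    return sigma2 * DU @ np.diag(1.0 / (vals[keep] + 1.0 / n)) @ DU.T
```

For a symmetric positive semidefinite matrix the singular value decomposition coincides with the eigendecomposition, which is why `eigh` is used. The Jacobian has one more column than its rank because $(a, u)$ is over-parameterized, which is the situation the generalized inverse in Theorem 2 is meant to handle.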

Remark 3. The number $d_r$ in (22) can be used as a guide in determining how large a model rank the given sample size $n$ allows. That is, the value $n - d_r$ should be adequate for error variance estimation.

3 Multistage Variable Selection for Genetic Main and GxG Effects

Building on the inference procedure developed for the low-rank model, we introduce SLR-screening in Section 3.1. In Section 3.2, SLR-screening is incorporated into the conventional SC procedure to form the ESC procedure for GxG detection.

3.1 Sparse and low-rank screening 

Due to the extremely high dimensionality of GxG models, a single-stage Lasso screening is not flexible enough for variable selection. To improve the performance, it is helpful to reduce the model size from $m_p$ to a smaller number. The main idea of SLR-screening is to first fit a low-rank model to filter out insignificant variables, and then to implement Lasso screening on the surviving variables. The algorithm is summarized below.

Sparse and Low-Rank Screening (SLR-Screening):

1. Low-Rank Screening: Fit the low-rank model (6). Based on the test statistics for $\beta_0$, select variables to obtain the index set $\mathcal{I}_{LR}$.

2. Sparse (Lasso) Screening: Fit Lasso on $\mathcal{I}_{LR}$. Those variables with non-zero estimates are identified in $\mathcal{I}_{SLR}$.

The goal of Stage-1 in SLR-screening is to select important variables by utilizing the low-rank property of $\eta$. To achieve this, we propose to fit the low-rank model (6) to obtain $\hat\beta_{\lambda_\ell}$ and $\hat\Sigma_0$. Based on Theorem 2, it is then reasonable to select variables as

$$\mathcal{I}_{LR} = \left\{ j : \frac{|\hat\beta_{\lambda_\ell, j}|}{\sqrt{n^{-1}\hat\Sigma_{0,j}}} > \alpha_\ell \right\} \tag{24}$$

for some $\alpha_\ell > 0$, where $\hat\beta_{\lambda_\ell, j}$ is the $j$th element of $\hat\beta_{\lambda_\ell}$, and $\hat\Sigma_{0,j}$ is the $j$th diagonal element of $\hat\Sigma_0$. Here the threshold value $\alpha_\ell$ controls the power of the low-rank screening.

The goal of Stage-2 in SLR-screening is to enforce sparsity. Based on the selected index set $\mathcal{I}_{LR}$, we refit the model with a 1-norm penalty by minimizing

$$\frac{1}{2}\|Y - X_{\mathcal{I}_{LR}}\beta_{\mathcal{I}_{LR}}\|^2 + \lambda_s\|\beta_{\mathcal{I}_{LR}}\|_1, \tag{25}$$

where $X_{\mathcal{I}_{LR}}$ and $\beta_{\mathcal{I}_{LR}}$ are, respectively, the selected variables and parameters in $\mathcal{I}_{LR}$, and $\lambda_s$ is a penalty parameter for the sparsity constraint. Let the minimizer of (25) be $\hat\beta_{\mathcal{I}_{LR}}$, and define

$$\mathcal{I}_{SLR} = \{ j \in \mathcal{I}_{LR} : \hat\beta_{\mathcal{I}_{LR}, j} \neq 0 \} \tag{26}$$

to be the final identified main effects and interactions from the screening stage, where $\hat\beta_{\mathcal{I}_{LR}, j}$ is the $j$th element of $\hat\beta_{\mathcal{I}_{LR}}$. To determine $\lambda_s$, K-fold cross-validation ($K = 10$ in this work) is applied. Subsequent analysis can then be conducted on the variables in $\mathcal{I}_{SLR}$.
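The two stages of SLR-screening can be sketched as follows. The sketch makes several simplifying assumptions that are not from the paper: the low-rank estimates and the diagonal of the covariance estimate are passed in as given, the Lasso is solved by a plain coordinate descent written here for self-containment, no intercept is handled, the design has no all-zero column, and the penalty is supplied directly rather than chosen by cross-validation. All function names are hypothetical.

```python
import numpy as np


def low_rank_screen(beta_hat, sigma_diag, n, alpha):
    """Stage 1: keep j with |beta_j| / sqrt(Sigma_jj / n) above the threshold alpha."""
    z = np.abs(beta_hat) / np.sqrt(sigma_diag / n)
    return np.flatnonzero(z > alpha)


def soft_threshold(x, lam):
    """Shrink x toward zero by lam, returning zero when |x| <= lam."""
    return np.sign(x) * max(abs(x) - lam, 0.0)


def lasso_cd(X, y, lam, n_sweeps=200):
    """Coordinate descent for 0.5 * ||y - X b||^2 + lam * ||b||_1."""
    b = np.zeros(X.shape[1])
    col_sq = np.sum(X ** 2, axis=0)
    resid = y.astype(float).copy()
    for _ in range(n_sweeps):
        for j in range(X.shape[1]):
            resid += X[:, j] * b[j]                  # remove coordinate j from the fit
            b[j] = soft_threshold(X[:, j] @ resid, lam) / col_sq[j]
            resid -= X[:, j] * b[j]                  # put the updated coordinate back
    return b


def slr_screen(X, y, beta_hat, sigma_diag, alpha, lam_s):
    """Stage 1 thresholding followed by Stage 2 Lasso on the surviving columns."""
    index_lr = low_rank_screen(beta_hat, sigma_diag, X.shape[0], alpha)
    b = lasso_cd(X[:, index_lr], y, lam_s)
    return index_lr[b != 0]
```

The Lasso in Stage 2 only ever sees the columns that survive Stage 1, which is where the reduction in model size comes from.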

3.2 Extended Screen-and-Clean for GxG 

Screen-and-Clean (SC) of Wasserman and Roeder (2009) is a novel variable selection procedure. First, the data are split into two parts, one for screening and the other for cleaning. The main reason for using two independent data sets is to control the type-I error while maintaining high detection power. In the screening stage, Lasso is used to fit all covariates, and those with zero estimates are dropped. The threshold for passing the screening is determined by cross-validation. In the cleaning stage, a linear regression model with the variables passing the screening process is fitted, and the resulting LSE is used to identify significant covariates via hypothesis testing. A critical assumption for the validity of SC is the sparsity of effective covariates. As a consequence, by using Lasso to reduce the model size, the success of the cleaning stage in identifying relevant covariates is guaranteed.

Recently, SC has been modified by Wu et al. (2010) to detect GxG as described in Section 1. This procedure has been shown to perform well through simulation studies. However, the procedure can be less efficient when the number of genes is large. For instance, many genes could remain after the first screening and, hence, a rather large number of parameters is required to fit model (1) for the second screening. As the performance of Lasso depends on the model size, a further reduction of the model size can help to increase the detection power. To achieve this aim, unlike standard SC, which fits the full model (1) with Lasso screening, we propose to fit the low-rank model (6) with SLR-screening instead. We call this procedure Extended Screen-and-Clean (ESC). Let $G^*$ be the set of all genes under consideration. Given a random partition $\mathcal{D}_1$ and $\mathcal{D}_2$ of the original data $\mathcal{D}$, the ESC procedure for detecting GxG is summarized below.

Extended Screen-and-Clean (ESC):

1. Based on $\mathcal{D}_1$, fit Lasso on $(Y, G^*)$ to obtain $\hat\xi_{G^*}$ with the 1-norm penalty $\lambda_m$. Let $G$ consist of the genes in $\{j : \hat\xi_{G^*, j} \neq 0\}$. Obtain $\mathcal{E}(G) = G \cup \{\text{all interactions of } G\}$.

2. Based on $\mathcal{D}_1$, implement SLR-screening on $(Y, \mathcal{E}(G))$ to obtain $\mathcal{I}_{SLR}$. Let $S$ consist of the main and interaction terms in $\mathcal{I}_{SLR}$.

3. Based on $\mathcal{D}_2$, fit LSE on $(Y, S)$ to obtain estimates of the main effects and interactions, $\hat\xi_S$ and $\hat\eta_S$. The chosen model is

$$\hat{\mathcal{M}} = \left\{ g_j,\ g_k g_l \in S : |T_j| > t_{n-1-|S|,\,\alpha/(2|S|)},\ |T_{kl}| > t_{n-1-|S|,\,\alpha/(2|S|)} \right\},$$

where $T_j$ and $T_{kl}$ are the t-statistics based on the elements of $\hat\xi_S$ and $\hat\eta_S$, respectively.

For the determination of $\lambda_m$ in Step-1 of ESC, Wu et al. (2010) use cross-validation. Later, Liu, Roeder and Wasserman (2010) introduced StARS (Stability Approach to Regularization Selection) for the selection of $\lambda_m$, and this selection criterion is adopted in the R code of Screen & Clean (available at http://wpicr.wpic.pitt.edu/WPICCompGen/). Note that the intercept is always included in the model. Note also that the proposed ESC is identical to Wu's SC, except that SLR-screening is implemented in Step-2 instead of Lasso screening. See Figure 1 for the flowchart of ESC.
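The cleaning step (Step-3) can be sketched as an ordinary least squares fit followed by thresholding of the t-statistics. In this sketch the critical value `t_crit` is a hypothetical argument supplied by the caller (for example a Bonferroni-adjusted t quantile), and the function name and toy data are assumptions rather than part of the SC or ESC code.

```python
import numpy as np


def clean(X_S, y, t_crit):
    """LSE with an intercept on the screened terms; keep terms with |T| > t_crit."""
    n, s = X_S.shape
    Z = np.hstack([np.ones((n, 1)), X_S])
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ coef
    sigma2 = resid @ resid / (n - 1 - s)          # residual variance, n - 1 - |S| d.f.
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(Z.T @ Z)))
    t_stats = coef[1:] / se[1:]                   # the intercept is not tested
    return np.flatnonzero(np.abs(t_stats) > t_crit), t_stats


# Toy cleaning set: only the first of three screened terms has a real effect.
rng = np.random.default_rng(11)
X_S = rng.normal(size=(200, 3))
y = 1.0 + 2.0 * X_S[:, 0] + rng.normal(size=200)
kept, t_stats = clean(X_S, y, t_crit=5.0)
```

Because the cleaning data are independent of the screening data, the t-statistics are not distorted by the selection made in Steps 1 and 2.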

4 Simulation Studies 

Our simulation studies are based on the design considered in Wu et al. (2010) with some extensions. In each simulated dataset, we generated genotype and trait values of 400 individuals. For genotypes, we generated 1000 SNPs, $G = [g_1, \ldots, g_{1000}]^T$ with $g_j \in \{0, 1, 2\}$, from a discretization of normal random variables satisfying $P(g_j = 0) = P(g_j = 2) = 0.25$ and $P(g_j = 1) = 0.5$. The 1000 SNPs are grouped into 200 5-SNP blocks, such that SNPs from different blocks are independent and SNPs within the same block are correlated with $R^2 = 0.3^2$. Conditional on $G$, we generate $Y$ using the following 4 models, where $\beta$ is the effect size and $\varepsilon \sim N(0, 1)$:

M1: $Y = \beta(g_5 g_6 + 0.8 g_{10} g_{11} + 0.6 g_{15} g_{16} + 0.4 g_{20} g_{21} + 0.2 g_{25} g_{26}) + \varepsilon$.

M2: $Y = \beta(g_5 g_6 + 0.8 g_{10} g_{11} + 0.6 g_{15} g_{16} + 2 g_{20} + 2 g_{21}) + \varepsilon$.

M3: $Y = \beta\,\mathrm{vecp}(\eta)^T \mathrm{vecp}(J) + \varepsilon$, where $\eta_{jk} = 0.9^{|j-k|}$ for $1 \le j \neq k \le 6$ and $\eta_{jk} = 0$ for $j, k > 6$.

M4: $Y = \beta\,\mathrm{vecp}(\eta)^T \mathrm{vecp}(J) + \varepsilon$, where we randomly generate $\eta_{jk} = \mathrm{sign}(u_1) \cdot u_2$ with $u_1 \sim U(-0.1, 0.9)$ and $u_2 \sim U(0.5, 1)$ for $1 \le j \neq k \le 8$, and $\eta_{jk} = 0$ for $j, k > 8$.
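A sketch of this data-generating design is given below for model M1. It is an approximation under stated assumptions: within-block dependence is induced through a shared latent normal factor with latent correlation `rho`, a hypothetical parameter whose value does not by itself reproduce the stated genotype-level $R^2$, and the function names are illustrative.

```python
from statistics import NormalDist

import numpy as np


def simulate_genotypes(n=400, n_blocks=200, block_size=5, rho=0.3, seed=0):
    """Discretize block-correlated standard normals into genotypes 0/1/2."""
    rng = np.random.default_rng(seed)
    shared = rng.normal(size=(n, n_blocks, 1))          # one factor per block
    own = rng.normal(size=(n, n_blocks, block_size))
    latent = np.sqrt(rho) * shared + np.sqrt(1 - rho) * own   # unit variance
    cut = NormalDist().inv_cdf(0.75)                    # quartile cut points
    geno = (latent > -cut).astype(int) + (latent > cut).astype(int)
    return geno.reshape(n, n_blocks * block_size)


def simulate_m1(G, beta, rng):
    """Trait under M1; g(j) uses the 1-based SNP index from the text."""
    def g(j):
        return G[:, j - 1]
    signal = (g(5) * g(6) + 0.8 * g(10) * g(11) + 0.6 * g(15) * g(16)
              + 0.4 * g(20) * g(21) + 0.2 * g(25) * g(26))
    return beta * signal + rng.normal(size=G.shape[0])


G = simulate_genotypes()
Y = simulate_m1(G, beta=1.0, rng=np.random.default_rng(1))
```

Cutting the latent variable at the lower and upper quartiles of the standard normal gives the genotype frequencies 0.25, 0.5 and 0.25 used in the design.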

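To compare the performances, let $\mathcal{M}_0$ denote the index set of nonzero coefficients of the true model, and let $\hat{\mathcal{M}}$ be the estimated model. Define the power to be $E(|\hat{\mathcal{M}} \cap \mathcal{M}_0| / |\mathcal{M}_0|)$, the exact discovery to be $P(\hat{\mathcal{M}} = \mathcal{M}_0)$, the false discovery rate (FDR) to be $E(|\hat{\mathcal{M}} \cap \mathcal{M}_0^c| / |\hat{\mathcal{M}}|)$, and the type-I error to be $P(\hat{\mathcal{M}} \cap \mathcal{M}_0^c \neq \emptyset)$. These quantities are reported with 100 replicates for each model.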
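These four criteria can be computed from the selected sets across replicates as in the sketch below. The function name is hypothetical, and the convention that an empty selection contributes zero to the FDR is an assumption made here.

```python
def selection_metrics(selected_sets, true_set):
    """Average power, exact discovery, FDR and type-I error over replicates."""
    true_set = set(true_set)
    totals = {"power": 0.0, "exact": 0.0, "fdr": 0.0, "type1": 0.0}
    for selected in selected_sets:
        selected = set(selected)
        false_pos = selected - true_set
        totals["power"] += len(selected & true_set) / len(true_set)
        totals["exact"] += selected == true_set
        # Assumed convention: an empty selection has false discovery proportion 0.
        totals["fdr"] += len(false_pos) / len(selected) if selected else 0.0
        totals["type1"] += bool(false_pos)
    return {name: value / len(selected_sets) for name, value in totals.items()}


# Four toy replicates with two true interaction terms.
truth = {(5, 6), (10, 11)}
replicates = [{(5, 6), (10, 11)}, {(5, 6), (1, 2)}, set(), {(5, 6)}]
metrics = selection_metrics(replicates, truth)
```

In this toy example the power is 0.5, the exact discovery rate 0.25, the FDR 0.125 and the type-I error 0.25.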

Simulation results under the different model settings are shown in Figures 2-5, where ESC(1) and ESC(2) denote ESC with the rank-1 and rank-2 models, respectively. It can be seen that both ESC(1) and ESC(2) control the FDR and type-I error adequately in all settings. In the pure interaction model M1, ESC(1) is the best performer, while the performances of SC and ESC(2) are comparable. Interestingly, when the true model contains main effects (M2, Figure 3), both ESC(1) and ESC(2) clearly outperform SC for every effect size $\beta$. This indicates that conventional SC using model (1) is not able to identify main effects efficiently. We found that the SC procedure is more likely to wrongly filter out the true main effects in the second Lasso screening stage. With the low-rank screening reducing the model size, however, these true main effects have a higher chance of entering the final LSE cleaning and, hence, a higher power of ESC is reasonably expected. The superiority of the ESC procedure is even more apparent under models M3-M4 (Figures 4-5), where the powers and exact discovery rates of ESC(1) and ESC(2) dominate those of SC for every effect size $\beta$. One reason is that there are many significant interactions involved in M3-M4, and ESC with a low-rank model is able to correctly filter out insignificant interactions in $\eta$ to achieve better performance. In contrast, directly using Lasso screening does not utilize the matrix structure of $\eta$. On the one hand, it tends to wrongly filter out significant interactions. On the other hand, it tends to leave too many insignificant terms in the screening stage. Consequently, the subsequent LSE does not have enough sample size to clean the model well, which results in lower detection power.

We note that although the rank of $\eta$ in models M1-M4 ranges from 6 to 8, ESC with the rank-1 and rank-2 models suffices to achieve good performance. This indicates the robustness and applicability of the low-rank model (6), even with an incorrectly specified rank $r$. Moreover, we observe that ESC(1) outperforms ESC(2) in most of the settings. Given that the aim of the low-rank screening in SLR-screening is to reduce the model size, a good approximation of $\eta$ is sufficient to remove unimportant terms. In contrast, while the rank-2 model approximates $\eta$ more precisely, it also requires more parameters in model fitting. With a limited sample size, the gain in approximation accuracy from the rank-2 model cannot compensate for the loss in estimation efficiency and, hence, ESC(2) may not perform better than ESC(1). See also Remark 3 for the discussion of selecting $r$ in the ESC procedure.
Figure 1: Flowchart of ESC for detecting GxG. The arrow indicates which part of the data is used. The case of SC replaces SLR-screening by Lasso screening.

Figure 2: Simulation results under M1.

Figure 3: Simulation results under M2.

Figure 4: Simulation results under M3.

Figure 5: Simulation results under M4.